Overview
Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 106 |
| Missing cells | 100 |
| Missing cells (%) | 3.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 24.1 KiB |
| Average record size in memory | 233.2 B |
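The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by variables times observations, and that 1 KiB is 1024 bytes; the report does not state either definition, so treat both as assumptions.

```python
# Raw counts copied from the Dataset statistics table.
n_variables = 29
n_observations = 106
missing_cells = 100
total_size_kib = 24.1  # already rounded in the report

# Assumed definition: share of missing cells among all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: total memory divided by row count (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

The small gap between this estimate and the reported 233.2 B is plausibly due to the total size being rounded to 24.1 KiB before division.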
Variable types
| Text | 10 |
|---|---|
| DateTime | 2 |
| Categorical | 11 |
| Numeric | 6 |
Alerts
| Alert | Type |
| suffix has constant value "Unknown" | Constant |
| state has constant value "Massachusetts" | Constant |
| county is highly overall correlated with fips and 3 other fields | High correlation |
| fips is highly overall correlated with county and 2 other fields | High correlation |
| gender is highly overall correlated with prefix | High correlation |
| healthcare_coverage is highly overall correlated with maiden | High correlation |
| healthcare_expenses is highly overall correlated with maiden | High correlation |
| income is highly overall correlated with income_category | High correlation |
| income_category is highly overall correlated with income | High correlation |
| lat is highly overall correlated with county | High correlation |
| lon is highly overall correlated with county and 1 other field | High correlation |
| maiden is highly overall correlated with healthcare_coverage and 1 other field | High correlation |
| marital is highly overall correlated with prefix | High correlation |
| prefix is highly overall correlated with gender and 1 other field | High correlation |
| zip is highly overall correlated with county and 1 other field | High correlation |
| maiden is highly imbalanced (56.7%) | Imbalance |
| race is highly imbalanced (59.2%) | Imbalance |
| deathdate has 100 (94.3%) missing values | Missing |
| id has unique values | Unique |
| ssn has unique values | Unique |
| address has unique values | Unique |
| lat has unique values | Unique |
| lon has unique values | Unique |
| healthcare_expenses has unique values | Unique |
| zip has 35 (33.0%) zeros | Zeros |
| healthcare_coverage has 7 (6.6%) zeros | Zeros |
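The two imbalance percentages can be reproduced from the value counts listed later in the report. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. The report does not state the formula, so this is an inference that happens to match both reported values; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy of the class shares.

    The score is 0 for a perfectly uniform column and approaches 1 as a
    single class dominates. This formula is an assumption, not taken
    from the report itself.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention for this sketch: one class, nothing to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# race: white 88, asian 8, black 6, other 3, native 1
race_counts = [88, 8, 6, 3, 1]
# maiden: "Unknown" 78 times plus 28 values that each occur once
maiden_counts = [78] + [1] * 28

print(f"race: {imbalance_score(race_counts):.1%}")
print(f"maiden: {imbalance_score(maiden_counts):.1%}")
```

Under this assumption the output agrees with the 59.2% and 56.7% shown in the alerts.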
Reproduction
| Analysis started | 2024-12-02 13:18:13.660335 |
|---|---|
| Analysis finished | 2024-12-02 13:18:18.585945 |
| Duration | 4.93 seconds |
| Software version | ydata-profiling v4.12.0 |
| Download configuration | config.json |
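The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-12-02 13:18:13.660335")
finished = datetime.fromisoformat("2024-12-02 13:18:18.585945")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```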
Variables
id
Text
Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 36 |
|---|---|
| Median length | 36 |
| Mean length | 36 |
| Min length | 36 |
Unique
| Unique | 106 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 30a6452c-4297-a1ac-977a-6a23237c7b46 |
|---|---|
| 2nd row | 34a4dcc4-35fb-6ad5-ab98-be285c586a4f |
| 3rd row | 7179458e-d6e3-c723-2530-d4acfe1c2668 |
| 4th row | 37c177ea-4398-fb7a-29fa-70eb3d673876 |
| 5th row | 0fef2411-21f0-a269-82fb-c42b55471405 |
| Value | Count | Frequency (%) |
| 50ca7edb-0dee-35e6-5d8f-66fbcb0b37c1 | 1 | 0.9% |
| f339a5f7-0b09-3072-2b01-7c8e8ca2c1fc | 1 | 0.9% |
| 780fe740-20fb-07ee-1fbd-3fafa9f5df91 | 1 | 0.9% |
| cca2c7f0-a2aa-94e5-ccea-cb78a7d38652 | 1 | 0.9% |
| 3c7e37b0-c610-bc9a-d75a-f782e5dc7598 | 1 | 0.9% |
| 37713015-cfb5-bf1a-70eb-970101f32341 | 1 | 0.9% |
| d426334c-a982-3a31-7e0f-ca3c7fe01310 | 1 | 0.9% |
| cb1b46a1-9cb5-1187-ccc5-9fb7b98aa957 | 1 | 0.9% |
| d1622e8b-d26b-ec81-ffcb-ec4bf2af385b | 1 | 0.9% |
| 34a4dcc4-35fb-6ad5-ab98-be285c586a4f | 1 | 0.9% |
| Other values (96) | 96 | 90.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 424 | 11.1% |
| e | 238 | 6.2% |
| 6 | 223 | 5.8% |
| 0 | 220 | 5.8% |
| 7 | 220 | 5.8% |
| c | 219 | 5.7% |
| 1 | 218 | 5.7% |
| 4 | 218 | 5.7% |
| 3 | 216 | 5.7% |
| 2 | 215 | 5.6% |
| Other values (7) | 1405 | 36.8% |
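A character table like the one above can be approximated by counting characters across all values of the column. The sketch below runs on the five sample rows only, so its counts differ from the full-column figures; it is an illustration of the idea, not the profiler's own code.

```python
from collections import Counter

# The five sample rows shown for the id column.
sample_ids = [
    "30a6452c-4297-a1ac-977a-6a23237c7b46",
    "34a4dcc4-35fb-6ad5-ab98-be285c586a4f",
    "7179458e-d6e3-c723-2530-d4acfe1c2668",
    "37c177ea-4398-fb7a-29fa-70eb3d673876",
    "0fef2411-21f0-a269-82fb-c42b55471405",
]

# Count every character across the concatenated values.
char_counts = Counter("".join(sample_ids))
total_chars = sum(char_counts.values())

for char, count in char_counts.most_common(3):
    print(f"{char!r}: {count} ({count / total_chars:.1%})")
```

Each id is 36 characters long with four hyphens, which is consistent with the full-column totals: 106 × 36 = 3816 characters and 106 × 4 = 424 hyphens.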
birthdate
Date
| Distinct | 100 |
|---|---|
| Distinct (%) | 94.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| Minimum | 1914-03-03 00:00:00 |
|---|---|
| Maximum | 2023-03-01 00:00:00 |
deathdate
Date
Missing 
| Distinct | 6 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 100 |
| Missing (%) | 94.3% |
| Memory size | 980.0 B |
| Minimum | 1972-06-01 00:00:00 |
|---|---|
| Maximum | 2022-01-17 00:00:00 |
ssn
Text
Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Unique
| Unique | 106 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 999-52-8591 |
|---|---|
| 2nd row | 999-75-3953 |
| 3rd row | 999-70-1925 |
| 4th row | 999-27-9779 |
| 5th row | 999-50-8977 |
| Value | Count | Frequency (%) |
| 999-27-5104 | 1 | 0.9% |
| 999-66-2146 | 1 | 0.9% |
| 999-71-1449 | 1 | 0.9% |
| 999-36-7955 | 1 | 0.9% |
| 999-74-5035 | 1 | 0.9% |
| 999-80-8977 | 1 | 0.9% |
| 999-80-9251 | 1 | 0.9% |
| 999-83-1974 | 1 | 0.9% |
| 999-55-3884 | 1 | 0.9% |
| 999-75-3953 | 1 | 0.9% |
| Other values (96) | 96 | 90.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 9 | 387 | 33.2% |
| - | 212 | 18.2% |
| 3 | 77 | 6.6% |
| 5 | 75 | 6.4% |
| 7 | 70 | 6.0% |
| 1 | 69 | 5.9% |
| 8 | 65 | 5.6% |
| 4 | 64 | 5.5% |
| 2 | 60 | 5.1% |
| 6 | 55 | 4.7% |
drivers
Text
| Distinct | 85 |
|---|---|
| Distinct (%) | 80.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.5849057 |
| Min length | 7 |
Unique
| Unique | 84 |
|---|---|
| Unique (%) | 79.2% |
Sample
| 1st row | S99996852 |
|---|---|
| 2nd row | S99993577 |
| 3rd row | Unknown |
| 4th row | S99995100 |
| 5th row | Unknown |
| Value | Count | Frequency (%) |
| unknown | 22 | 20.8% |
| s99911538 | 1 | 0.9% |
| s99917166 | 1 | 0.9% |
| s99959101 | 1 | 0.9% |
| s99975537 | 1 | 0.9% |
| s99996852 | 1 | 0.9% |
| s99943171 | 1 | 0.9% |
| s99941458 | 1 | 0.9% |
| s99947055 | 1 | 0.9% |
| s99941595 | 1 | 0.9% |
| Other values (75) | 75 | 70.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 9 | 291 | 32.0% |
| S | 84 | 9.2% |
| n | 66 | 7.3% |
| 7 | 56 | 6.2% |
| 5 | 49 | 5.4% |
| 1 | 48 | 5.3% |
| 6 | 47 | 5.2% |
| 3 | 46 | 5.1% |
| 8 | 39 | 4.3% |
| 4 | 36 | 4.0% |
| Other values (6) | 148 | 16.3% |
passport
Text
| Distinct | 76 |
|---|---|
| Distinct (%) | 71.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.0754717 |
| Min length | 7 |
Unique
| Unique | 75 |
|---|---|
| Unique (%) | 70.8% |
Sample
| 1st row | X47758697X |
|---|---|
| 2nd row | X28173268X |
| 3rd row | Unknown |
| 4th row | X83694889X |
| 5th row | Unknown |
| Value | Count | Frequency (%) |
| unknown | 31 | 29.2% |
| x83694889x | 1 | 0.9% |
| x20976043x | 1 | 0.9% |
| x31272602x | 1 | 0.9% |
| x37637991x | 1 | 0.9% |
| x59458953x | 1 | 0.9% |
| x55687474x | 1 | 0.9% |
| x14417836x | 1 | 0.9% |
| x57524913x | 1 | 0.9% |
| x27670495x | 1 | 0.9% |
| Other values (66) | 66 | 62.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| X | 150 | 15.6% |
| n | 93 | 9.7% |
| 7 | 70 | 7.3% |
| 4 | 67 | 7.0% |
| 6 | 64 | 6.7% |
| 3 | 63 | 6.5% |
| 9 | 58 | 6.0% |
| 8 | 58 | 6.0% |
| 1 | 56 | 5.8% |
| 5 | 55 | 5.7% |
| Other values (6) | 228 | 23.7% |
prefix
Categorical
High correlation 
| Distinct | 4 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| Mr. | 31 |
|---|---|
| Mrs. | 28 |
| Unknown | 27 |
| Ms. | 20 |
Length
| Max length | 7 |
|---|---|
| Median length | 4 |
| Mean length | 4.2830189 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Mr. |
|---|---|
| 2nd row | Mr. |
| 3rd row | Unknown |
| 4th row | Mrs. |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Mr. | 31 | 29.2% |
| Mrs. | 28 | 26.4% |
| Unknown | 27 | 25.5% |
| Ms. | 20 | 18.9% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| mr | 31 | 29.2% |
| mrs | 28 | 26.4% |
| unknown | 27 | 25.5% |
| ms | 20 | 18.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 81 | 17.8% |
| M | 79 | 17.4% |
| . | 79 | 17.4% |
| r | 59 | 13.0% |
| s | 48 | 10.6% |
| U | 27 | 5.9% |
| k | 27 | 5.9% |
| o | 27 | 5.9% |
| w | 27 | 5.9% |
firstname
Text
| Distinct | 104 |
|---|---|
| Distinct (%) | 98.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 14 |
|---|---|
| Median length | 13 |
| Mean length | 8.8584906 |
| Min length | 5 |
Unique
| Unique | 102 |
|---|---|
| Unique (%) | 96.2% |
Sample
| 1st row | Joshua658 |
|---|---|
| 2nd row | Bennie663 |
| 3rd row | Hunter736 |
| 4th row | Carlyn477 |
| 5th row | Robin66 |
| Value | Count | Frequency (%) |
| hershel911 | 2 | 1.9% |
| homero668 | 2 | 1.9% |
| antonia30 | 1 | 0.9% |
| bennie663 | 1 | 0.9% |
| hunter736 | 1 | 0.9% |
| carlyn477 | 1 | 0.9% |
| robin66 | 1 | 0.9% |
| arthur650 | 1 | 0.9% |
| caryl47 | 1 | 0.9% |
| willian804 | 1 | 0.9% |
| Other values (94) | 94 | 88.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 82 | 8.7% |
| e | 65 | 6.9% |
| 6 | 51 | 5.4% |
| r | 50 | 5.3% |
| i | 50 | 5.3% |
| n | 48 | 5.1% |
| l | 36 | 3.8% |
| o | 35 | 3.7% |
| 4 | 34 | 3.6% |
| 7 | 32 | 3.4% |
| Other values (47) | 456 | 48.6% |
middlename
Text
| Distinct | 89 |
|---|---|
| Distinct (%) | 84.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 14 |
|---|---|
| Median length | 13 |
| Mean length | 8.6981132 |
| Min length | 6 |
Unique
| Unique | 87 |
|---|---|
| Unique (%) | 82.1% |
Sample
| 1st row | Alvin56 |
|---|---|
| 2nd row | Unknown |
| 3rd row | Mckinley734 |
| 4th row | Florencia449 |
| 5th row | Jeramy610 |
| Value | Count | Frequency (%) |
| unknown | 17 | 15.9% |
| danita413 | 2 | 1.9% |
| mckinley734 | 1 | 0.9% |
| florencia449 | 1 | 0.9% |
| jeramy610 | 1 | 0.9% |
| lelia627 | 1 | 0.9% |
| shelton25 | 1 | 0.9% |
| jordan900 | 1 | 0.9% |
| salley758 | 1 | 0.9% |
| bobbi508 | 1 | 0.9% |
| Other values (80) | 80 | 74.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 99 | 10.7% |
| a | 76 | 8.2% |
| e | 50 | 5.4% |
| i | 47 | 5.1% |
| o | 45 | 4.9% |
| r | 42 | 4.6% |
| l | 32 | 3.5% |
| 7 | 32 | 3.5% |
| 4 | 30 | 3.3% |
| 8 | 30 | 3.3% |
| Other values (45) | 439 | 47.6% |
lastname
Text
| Distinct | 93 |
|---|---|
| Distinct (%) | 87.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 14 |
|---|---|
| Median length | 12 |
| Mean length | 9.8679245 |
| Min length | 6 |
Unique
| Unique | 81 |
|---|---|
| Unique (%) | 76.4% |
Sample
| 1st row | Kunde533 |
|---|---|
| 2nd row | Ebert178 |
| 3rd row | Gerlach374 |
| 4th row | Williamson769 |
| 5th row | Gleichner915 |
| Value | Count | Frequency (%) |
| brakus656 | 3 | 2.8% |
| franecki195 | 2 | 1.9% |
| balistreri607 | 2 | 1.9% |
| carter549 | 2 | 1.9% |
| greenholt190 | 2 | 1.9% |
| schinner682 | 2 | 1.9% |
| schuppe920 | 2 | 1.9% |
| friesen796 | 2 | 1.9% |
| ebert178 | 2 | 1.9% |
| schiller186 | 2 | 1.9% |
| Other values (83) | 85 | 80.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 87 | 8.3% |
| r | 67 | 6.4% |
| n | 52 | 5.0% |
| i | 52 | 5.0% |
| a | 51 | 4.9% |
| 9 | 48 | 4.6% |
| l | 44 | 4.2% |
| 5 | 37 | 3.5% |
| 6 | 34 | 3.3% |
| s | 34 | 3.3% |
| Other values (47) | 540 | 51.6% |
suffix
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| Unknown | 106 |
|---|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unknown |
|---|---|
| 2nd row | Unknown |
| 3rd row | Unknown |
| 4th row | Unknown |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 106 | 100.0% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| unknown | 106 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 318 | 42.9% |
| U | 106 | 14.3% |
| k | 106 | 14.3% |
| o | 106 | 14.3% |
| w | 106 | 14.3% |
maiden
Categorical
High correlation  Imbalance 
| Distinct | 29 |
|---|---|
| Distinct (%) | 27.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| Unknown | 78 |
|---|---|
| Rogahn59 | 1 |
| Lubowitz58 | 1 |
| Romaguera67 | 1 |
| Langosh790 | 1 |
| Other values (24) | 24 |
Length
| Max length | 14 |
|---|---|
| Median length | 7 |
| Mean length | 7.5660377 |
| Min length | 7 |
Unique
| Unique | 28 |
|---|---|
| Unique (%) | 26.4% |
Sample
| 1st row | Unknown |
|---|---|
| 2nd row | Unknown |
| 3rd row | Unknown |
| 4th row | Rogahn59 |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 78 | 73.6% |
| Rogahn59 | 1 | 0.9% |
| Lubowitz58 | 1 | 0.9% |
| Romaguera67 | 1 | 0.9% |
| Langosh790 | 1 | 0.9% |
| Wolf938 | 1 | 0.9% |
| Kshlerin58 | 1 | 0.9% |
| Durgan499 | 1 | 0.9% |
| Abreu185 | 1 | 0.9% |
| Hagenes547 | 1 | 0.9% |
| Other values (19) | 19 | 17.9% |
[Plot omitted: Length]
| Value | Count | Frequency (%) |
| unknown | 78 | 73.6% |
| rogahn59 | 1 | 0.9% |
| lubowitz58 | 1 | 0.9% |
| romaguera67 | 1 | 0.9% |
| langosh790 | 1 | 0.9% |
| wolf938 | 1 | 0.9% |
| kshlerin58 | 1 | 0.9% |
| durgan499 | 1 | 0.9% |
| abreu185 | 1 | 0.9% |
| hagenes547 | 1 | 0.9% |
| Other values (19) | 19 | 17.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 246 | 30.7% |
| o | 90 | 11.2% |
| k | 81 | 10.1% |
| w | 79 | 9.9% |
| U | 78 | 9.7% |
| a | 19 | 2.4% |
| r | 17 | 2.1% |
| e | 16 | 2.0% |
| 1 | 12 | 1.5% |
| 9 | 11 | 1.4% |
| Other values (36) | 153 | 19.1% |
marital
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| Unknown | 42 |
|---|---|
| M | 33 |
| D | 15 |
| S | 12 |
| W | 4 |
Length
| Max length | 7 |
|---|---|
| Median length | 1 |
| Mean length | 3.3773585 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | D |
| 3rd row | Unknown |
| 4th row | M |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 42 | 39.6% |
| M | 33 | 31.1% |
| D | 15 | 14.2% |
| S | 12 | 11.3% |
| W | 4 | 3.8% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| unknown | 42 | 39.6% |
| m | 33 | 31.1% |
| d | 15 | 14.2% |
| s | 12 | 11.3% |
| w | 4 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 126 | 35.2% |
| U | 42 | 11.7% |
| k | 42 | 11.7% |
| o | 42 | 11.7% |
| w | 42 | 11.7% |
| M | 33 | 9.2% |
| D | 15 | 4.2% |
| S | 12 | 3.4% |
| W | 4 | 1.1% |
race
Categorical
Imbalance 
| Distinct | 5 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| white | 88 |
|---|---|
| asian | 8 |
| black | 6 |
| other | 3 |
| native | 1 |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 5.009434 |
| Min length | 5 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.9% |
Sample
| 1st row | white |
|---|---|
| 2nd row | white |
| 3rd row | white |
| 4th row | asian |
| 5th row | white |
Common Values
| Value | Count | Frequency (%) |
| white | 88 | 83.0% |
| asian | 8 | 7.5% |
| black | 6 | 5.7% |
| other | 3 | 2.8% |
| native | 1 | 0.9% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| white | 88 | 83.0% |
| asian | 8 | 7.5% |
| black | 6 | 5.7% |
| other | 3 | 2.8% |
| native | 1 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 97 | 18.3% |
| e | 92 | 17.3% |
| t | 92 | 17.3% |
| h | 91 | 17.1% |
| w | 88 | 16.6% |
| a | 23 | 4.3% |
| n | 9 | 1.7% |
| s | 8 | 1.5% |
| b | 6 | 1.1% |
| l | 6 | 1.1% |
| Other values (5) | 19 | 3.6% |
ethnicity
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| nonhispanic | 90 |
|---|---|
| hispanic | 16 |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.54717 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | nonhispanic |
|---|---|
| 2nd row | nonhispanic |
| 3rd row | nonhispanic |
| 4th row | nonhispanic |
| 5th row | nonhispanic |
Common Values
| Value | Count | Frequency (%) |
| nonhispanic | 90 | 84.9% |
| hispanic | 16 | 15.1% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| nonhispanic | 90 | 84.9% |
| hispanic | 16 | 15.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 286 | 25.6% |
| i | 212 | 19.0% |
| h | 106 | 9.5% |
| s | 106 | 9.5% |
| a | 106 | 9.5% |
| p | 106 | 9.5% |
| c | 106 | 9.5% |
| o | 90 | 8.1% |
gender
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
| F | 59 |
|---|---|
| M | 47 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | F |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| F | 59 | 55.7% |
| M | 47 | 44.3% |
[Plots omitted: Length, Common Values (Plot)]
| Value | Count | Frequency (%) |
| f | 59 | 55.7% |
| m | 47 | 44.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 59 | 55.7% |
| M | 47 | 44.3% |
birthplace
Text
| Distinct | 83 |
|---|---|
| Distinct (%) | 78.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 40 |
|---|---|
| Median length | 36 |
| Mean length | 27.707547 |
| Min length | 18 |
Unique
| Unique | 70 |
|---|---|
| Unique (%) | 66.0% |
Sample
| 1st row | Boston Massachusetts US |
|---|---|
| 2nd row | Chicopee Massachusetts US |
| 3rd row | Spencer Massachusetts US |
| 4th row | Franklin Massachusetts US |
| 5th row | Brockton Massachusetts US |
| Value | Count | Frequency (%) |
| us | 91 | 26.0% |
| massachusetts | 91 | 26.0% |
| boston | 11 | 3.1% |
| north | 6 | 1.7% |
| santiago | 6 | 1.7% |
| do | 4 | 1.1% |
| de | 3 | 0.9% |
| puerto | 3 | 0.9% |
| los | 3 | 0.9% |
| pr | 3 | 0.9% |
| Other values (106) | 129 | 36.9% |
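The table above counts lowercase word tokens rather than whole values, which is why "us" and "massachusetts" each appear 91 times. A rough sketch of that kind of count on the five sample rows, assuming simple whitespace splitting and lowercasing (the profiler's actual tokenizer may treat punctuation differently):

```python
from collections import Counter

# The five sample rows shown for the birthplace column.
sample_birthplaces = [
    "Boston Massachusetts US",
    "Chicopee Massachusetts US",
    "Spencer Massachusetts US",
    "Franklin Massachusetts US",
    "Brockton Massachusetts US",
]

# Assumed tokenization: split on whitespace, then lowercase each token.
words = [w.lower() for place in sample_birthplaces for w in place.split()]
word_counts = Counter(words)
total_words = len(words)

for word, count in word_counts.most_common(2):
    print(f"{word}: {count} ({count / total_words:.1%})")
```

Percentages in a word table are relative to the total number of tokens, not the number of rows, so they need not match the row-based figures elsewhere in the report.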
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 456 | 15.5% |
| s | 408 | 13.9% |
| a | 273 | 9.3% |
| t | 259 | 8.8% |
| e | 182 | 6.2% |
| h | 127 | 4.3% |
| u | 118 | 4.0% |
| c | 116 | 3.9% |
| S | 112 | 3.8% |
| o | 111 | 3.8% |
| Other values (39) | 775 | 26.4% |
address
Text
Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 33 |
|---|---|
| Median length | 27 |
| Mean length | 20.367925 |
| Min length | 13 |
Unique
| Unique | 106 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 811 Kihn Viaduct |
|---|---|
| 2nd row | 975 Pfannerstill Throughway |
| 3rd row | 548 Heller Lane |
| 4th row | 160 Fadel Crossroad Apt 65 |
| 5th row | 766 Grant Loaf Unit 15 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| unit | 16 | 4.0% |
| apt | 12 | 3.0% |
| suite | 12 | 3.0% |
| road | 4 | 1.0% |
| row | 4 | 1.0% |
| brook | 3 | 0.8% |
| esplanade | 3 | 0.8% |
| rapid | 3 | 0.8% |
| stravenue | 3 | 0.8% |
| street | 3 | 0.8% |
| Other values (289) | 337 | 84.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 294 | 13.6% |
| e | 153 | 7.1% |
| a | 129 | 6.0% |
| r | 114 | 5.3% |
| i | 103 | 4.8% |
| n | 98 | 4.5% |
| t | 97 | 4.5% |
| o | 86 | 4.0% |
| s | 59 | 2.7% |
| l | 59 | 2.7% |
| Other values (49) | 967 | 44.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2159 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2159 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2159 | 100.0% |
city
Text
| Distinct | 78 |
|---|---|
| Distinct (%) | 73.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 22 |
|---|---|
| Median length | 16 |
| Mean length | 8.7830189 |
| Min length | 4 |
Unique
| Unique | 61 |
|---|---|
| Unique (%) | 57.5% |
Sample
| 1st row | Braintree |
|---|---|
| 2nd row | Braintree |
| 3rd row | Mattapoisett |
| 4th row | Wareham |
| 5th row | Groveland |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| boston | 7 | 5.8% |
| west | 7 | 5.8% |
| braintree | 4 | 3.3% |
| yarmouth | 3 | 2.5% |
| tisbury | 3 | 2.5% |
| brookfield | 3 | 2.5% |
| somerville | 3 | 2.5% |
| north | 3 | 2.5% |
| lowell | 3 | 2.5% |
| springfield | 3 | 2.5% |
| Other values (70) | 82 | 67.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 100 | 10.7% |
| o | 83 | 8.9% |
| r | 75 | 8.1% |
| t | 74 | 7.9% |
| n | 63 | 6.8% |
| i | 51 | 5.5% |
| a | 50 | 5.4% |
| l | 50 | 5.4% |
| s | 37 | 4.0% |
| d | 34 | 3.7% |
| Other values (32) | 314 | 33.7% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 931 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 931 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 931 | 100.0% |
state
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Massachusetts |
|---|---|
| 2nd row | Massachusetts |
| 3rd row | Massachusetts |
| 4th row | Massachusetts |
| 5th row | Massachusetts |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Massachusetts | 106 | 100.0% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| massachusetts | 106 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| s | 424 | 30.8% |
| a | 212 | 15.4% |
| t | 212 | 15.4% |
| M | 106 | 7.7% |
| c | 106 | 7.7% |
| h | 106 | 7.7% |
| u | 106 | 7.7% |
| e | 106 | 7.7% |
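
The character counts above follow directly from the constant value: each of the 106 rows holds the 13-character string "Massachusetts". A minimal sketch, assuming the column can be rebuilt as a plain Python list (the `values` list and the `frequency_pct` helper are illustrative stand-ins, not the profiler's internals), reproduces the counts and the corresponding percentages with `collections.Counter`:

```python
from collections import Counter

# Stand-in for the `state` column: 106 identical rows, as the report states.
values = ["Massachusetts"] * 106

# Count every character across all rows.
char_counts = Counter("".join(values))
total_chars = sum(char_counts.values())  # 106 rows x 13 characters


def frequency_pct(count: int, total: int) -> str:
    """Format a count as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


# Print characters from most to least frequent.
for char, count in char_counts.most_common():
    print(char, count, frequency_pct(count, total_chars))
```

The same count-over-total calculation applies to the word and value tables, with the total taken over words or rows instead of characters.
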
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1378 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1378 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1378 | 100.0% |
county
Categorical
High correlation 
| Distinct | 13 |
|---|---|
| Distinct (%) | 12.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 17 |
|---|---|
| Median length | 16 |
| Mean length | 14.830189 |
| Min length | 12 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.9% |
Sample
| 1st row | Norfolk County |
|---|---|
| 2nd row | Norfolk County |
| 3rd row | Plymouth County |
| 4th row | Plymouth County |
| 5th row | Essex County |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Middlesex County | 29 | 27.4% |
| Worcester County | 13 | 12.3% |
| Essex County | 10 | 9.4% |
| Norfolk County | 9 | 8.5% |
| Plymouth County | 8 | 7.5% |
| Suffolk County | 8 | 7.5% |
| Hampden County | 8 | 7.5% |
| Bristol County | 7 | 6.6% |
| Barnstable County | 5 | 4.7% |
| Dukes County | 4 | 3.8% |
| Other values (3) | 5 | 4.7% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| county | 106 | 50.0% |
| middlesex | 29 | 13.7% |
| worcester | 13 | 6.1% |
| essex | 10 | 4.7% |
| norfolk | 9 | 4.2% |
| plymouth | 8 | 3.8% |
| suffolk | 8 | 3.8% |
| hampden | 8 | 3.8% |
| bristol | 7 | 3.3% |
| barnstable | 5 | 2.4% |
| Other values (4) | 9 | 4.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| o | 160 | 10.2% |
| t | 139 | 8.8% |
| u | 126 | 8.0% |
| n | 121 | 7.7% |
| e | 117 | 7.4% |
| y | 114 | 7.3% |
| C | 106 | 6.7% |
| (space) | 106 | 6.7% |
| s | 82 | 5.2% |
| l | 67 | 4.3% |
| Other values (22) | 434 | 27.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1572 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1572 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1572 | 100.0% |
fips
Categorical
High correlation 
| Distinct | 14 |
|---|---|
| Distinct (%) | 13.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | 3.8% |
Sample
| 1st row | 25021.0 |
|---|---|
| 2nd row | 25021.0 |
| 3rd row | Unknown |
| 4th row | Unknown |
| 5th row | Unknown |
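
The sample shows `fips` stored as text, mixing float-formatted codes such as `25021.0` with the placeholder `Unknown`. If five-digit county codes are wanted downstream, a minimal sketch could normalize them as follows. The helper name `normalize_fips` is hypothetical, and treating `Unknown` as missing is an assumption about the intended meaning.

```python
from typing import Optional


def normalize_fips(raw: str) -> Optional[str]:
    """Convert a value like '25021.0' to '25021'; return None for placeholders.

    Hypothetical helper: mapping 'Unknown' to None is an assumption.
    """
    text = raw.strip()
    if not text or text.lower() == "unknown":
        return None
    try:
        code = int(float(text))
    except (ValueError, OverflowError):
        # Not a number at all (or NaN/inf): treat as missing.
        return None
    # County FIPS codes have five digits: two for the state, three for the county.
    return f"{code:05d}"


sample = ["25021.0", "25021.0", "Unknown", "Unknown", "Unknown"]
print([normalize_fips(value) for value in sample])
```
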
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Unknown | 35 | 33.0% |
| 25017.0 | 24 | 22.6% |
| 25021.0 | 8 | 7.5% |
| 25027.0 | 8 | 7.5% |
| 25013.0 | 7 | 6.6% |
| 25025.0 | 7 | 6.6% |
| 25009.0 | 5 | 4.7% |
| 25001.0 | 4 | 3.8% |
| 25003.0 | 2 | 1.9% |
| 25005.0 | 2 | 1.9% |
| Other values (4) | 4 | 3.8% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| unknown | 35 | 33.0% |
| 25017.0 | 24 | 22.6% |
| 25021.0 | 8 | 7.5% |
| 25027.0 | 8 | 7.5% |
| 25013.0 | 7 | 6.6% |
| 25025.0 | 7 | 6.6% |
| 25009.0 | 5 | 4.7% |
| 25001.0 | 4 | 3.8% |
| 25003.0 | 2 | 1.9% |
| 25005.0 | 2 | 1.9% |
| Other values (4) | 4 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 156 | 21.0% |
| n | 105 | 14.2% |
| 2 | 94 | 12.7% |
| 5 | 80 | 10.8% |
| . | 71 | 9.6% |
| 1 | 46 | 6.2% |
| w | 35 | 4.7% |
| o | 35 | 4.7% |
| k | 35 | 4.7% |
| U | 35 | 4.7% |
| Other values (4) | 50 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 742 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 742 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 742 | 100.0% |
zip
Real number (ℝ)
High correlation  Zeros 
| Distinct | 61 |
|---|---|
| Distinct (%) | 57.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1317.8868 |
| Minimum | 0 |
| Maximum | 2861 |
| Zeros | 35 |
| Zeros (%) | 33.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1605 |
| Q3 | 2154.25 |
| 95-th percentile | 2636.25 |
| Maximum | 2861 |
| Range | 2861 |
| Interquartile range (IQR) | 2154.25 |
Descriptive statistics
| Standard deviation | 1012.0357 |
|---|---|
| Coefficient of variation (CV) | 0.7679231 |
| Kurtosis | -1.539062 |
| Mean | 1317.8868 |
| Median Absolute Deviation (MAD) | 602 |
| Skewness | -0.29374171 |
| Sum | 139696 |
| Variance | 1024216.3 |
| Monotonicity | Not monotonic |
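
The maximum of 2861 and the 35 zeros suggest that `zip` was parsed as a number: Massachusetts ZIP codes begin with `0`, so a code such as `02184` loses its leading zero, and a `0` most likely stands for a missing code. A minimal sketch of undoing this, assuming that reading is correct (the function name `restore_zip` is illustrative):

```python
from typing import Optional


def restore_zip(value: float) -> Optional[str]:
    """Return a five-digit ZIP string, or None when the value is 0.

    Assumption: 0 marks a missing ZIP code rather than a real one.
    """
    number = int(value)
    if number == 0:
        return None
    # Left-pad with zeros to restore the dropped leading digit.
    return f"{number:05d}"


print(restore_zip(2184))
print(restore_zip(0))
```

With the zeros treated as missing, summary statistics such as the mean and quartiles stop being meaningful for this column, which is better handled as a categorical code.
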
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 35 | 33.0% |
| 2184 | 4 | 3.8% |
| 2138 | 2 | 1.9% |
| 2492 | 2 | 1.9% |
| 1915 | 2 | 1.9% |
| 1030 | 2 | 1.9% |
| 2155 | 2 | 1.9% |
| 2180 | 2 | 1.9% |
| 2468 | 2 | 1.9% |
| 1585 | 2 | 1.9% |
| Other values (51) | 51 | 48.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 35 | 33.0% |
| 1020 | 1 | 0.9% |
| 1030 | 2 | 1.9% |
| 1038 | 1 | 0.9% |
| 1040 | 1 | 0.9% |
| 1108 | 1 | 0.9% |
| 1129 | 1 | 0.9% |
| 1199 | 1 | 0.9% |
| 1201 | 1 | 0.9% |
| 1220 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2861 | 1 | 0.9% |
| 2743 | 1 | 0.9% |
| 2718 | 1 | 0.9% |
| 2675 | 1 | 0.9% |
| 2673 | 1 | 0.9% |
| 2638 | 1 | 0.9% |
| 2631 | 1 | 0.9% |
| 2492 | 2 | 1.9% |
| 2476 | 1 | 0.9% |
| 2472 | 1 | 0.9% |
lat
Real number (ℝ)
High correlation  Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42.242437 |
| Minimum | 41.296177 |
| Maximum | 42.816043 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | 41.296177 |
|---|---|
| 5-th percentile | 41.650588 |
| Q1 | 42.103036 |
| median | 42.289558 |
| Q3 | 42.449431 |
| 95-th percentile | 42.685496 |
| Maximum | 42.816043 |
| Range | 1.519866 |
| Interquartile range (IQR) | 0.34639512 |
Descriptive statistics
| Standard deviation | 0.3301513 |
|---|---|
| Coefficient of variation (CV) | 0.0078156311 |
| Kurtosis | 0.36660563 |
| Mean | 42.242437 |
| Median Absolute Deviation (MAD) | 0.18650434 |
| Skewness | -0.82384894 |
| Sum | 4477.6983 |
| Variance | 0.10899988 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 42.42176351 | 1 | 0.9% |
| 42.21114203 | 1 | 0.9% |
| 41.43371899 | 1 | 0.9% |
| 42.55418125 | 1 | 0.9% |
| 42.0658866 | 1 | 0.9% |
| 42.11271886 | 1 | 0.9% |
| 42.48040619 | 1 | 0.9% |
| 42.34024487 | 1 | 0.9% |
| 42.04613216 | 1 | 0.9% |
| 42.23763486 | 1 | 0.9% |
| Other values (96) | 96 | 90.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 41.29617684 | 1 | 0.9% |
| 41.35420569 | 1 | 0.9% |
| 41.43371899 | 1 | 0.9% |
| 41.49098336 | 1 | 0.9% |
| 41.53862265 | 1 | 0.9% |
| 41.64829219 | 1 | 0.9% |
| 41.65747489 | 1 | 0.9% |
| 41.66687013 | 1 | 0.9% |
| 41.66850729 | 1 | 0.9% |
| 41.68058212 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 42.81604282 | 1 | 0.9% |
| 42.74496939 | 1 | 0.9% |
| 42.73418302 | 1 | 0.9% |
| 42.71912524 | 1 | 0.9% |
| 42.69560171 | 1 | 0.9% |
| 42.68816668 | 1 | 0.9% |
| 42.67748594 | 1 | 0.9% |
| 42.67584335 | 1 | 0.9% |
| 42.65765376 | 1 | 0.9% |
| 42.64079354 | 1 | 0.9% |
lon
Real number (ℝ)
High correlation  Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -71.328445 |
| Minimum | -73.237785 |
| Maximum | -70.107363 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 106 |
| Negative (%) | 100.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | -73.237785 |
|---|---|
| 5-th percentile | -72.616709 |
| Q1 | -71.522152 |
| median | -71.148032 |
| Q3 | -71.002221 |
| 95-th percentile | -70.618146 |
| Maximum | -70.107363 |
| Range | 3.1304225 |
| Interquartile range (IQR) | 0.51993114 |
Descriptive statistics
| Standard deviation | 0.62739907 |
|---|---|
| Coefficient of variation (CV) | -0.0087959169 |
| Kurtosis | 0.94806933 |
| Mean | -71.328445 |
| Median Absolute Deviation (MAD) | 0.17323482 |
| Skewness | -1.0377175 |
| Sum | -7560.8151 |
| Variance | 0.39362959 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| -71.00480709 | 1 | 0.9% |
| -71.04580218 | 1 | 0.9% |
| -70.66515525 | 1 | 0.9% |
| -71.14725027 | 1 | 0.9% |
| -72.25779487 | 1 | 0.9% |
| -70.98132689 | 1 | 0.9% |
| -70.8970629 | 1 | 0.9% |
| -71.08674318 | 1 | 0.9% |
| -71.00135889 | 1 | 0.9% |
| -71.018562 | 1 | 0.9% |
| Other values (96) | 96 | 90.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -73.23778524 | 1 | 0.9% |
| -73.11924941 | 1 | 0.9% |
| -72.65371344 | 1 | 0.9% |
| -72.63252116 | 1 | 0.9% |
| -72.62755691 | 1 | 0.9% |
| -72.61762464 | 1 | 0.9% |
| -72.61396084 | 1 | 0.9% |
| -72.5987263 | 1 | 0.9% |
| -72.56676186 | 1 | 0.9% |
| -72.54947692 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -70.10736269 | 1 | 0.9% |
| -70.17711054 | 1 | 0.9% |
| -70.21884617 | 1 | 0.9% |
| -70.23283428 | 1 | 0.9% |
| -70.28956236 | 1 | 0.9% |
| -70.61376021 | 1 | 0.9% |
| -70.63130157 | 1 | 0.9% |
| -70.65792862 | 1 | 0.9% |
| -70.66515525 | 1 | 0.9% |
| -70.71161606 | 1 | 0.9% |
healthcare_expenses
Real number (ℝ)
High correlation  Unique 
| Distinct | 106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 202248.22 |
| Minimum | 527.54 |
| Maximum | 1157946.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | 527.54 |
|---|---|
| 5-th percentile | 4963.3575 |
| Q1 | 29803.083 |
| median | 93132.02 |
| Q3 | 243127.97 |
| 95-th percentile | 838508.74 |
| Maximum | 1157946.9 |
| Range | 1157419.4 |
| Interquartile range (IQR) | 213324.89 |
Descriptive statistics
| Standard deviation | 263109.59 |
|---|---|
| Coefficient of variation (CV) | 1.3009242 |
| Kurtosis | 3.2083286 |
| Mean | 202248.22 |
| Median Absolute Deviation (MAD) | 80789.405 |
| Skewness | 1.9370994 |
| Sum | 21438311 |
| Variance | 6.9226658 × 10¹⁰ |
| Monotonicity | Not monotonic |
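
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the rounded values printed in this report, so small rounding differences are expected. The variable names are illustrative.

```python
# Rounded values copied from the healthcare_expenses tables above.
minimum, maximum = 527.54, 1157946.9
q1, q3 = 29803.083, 243127.97
mean, std = 202248.22, 263109.59

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(f"range    = {value_range:.1f}")
print(f"iqr      = {iqr:.2f}")
print(f"cv       = {cv:.7f}")
print(f"variance = {variance:.7e}")
```

The recomputed variance is on the order of 10¹⁰, which matches the reported standard deviation of about 2.6 × 10⁵ squared.
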
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 655129.7 | 1 | 0.9% |
| 56904.96 | 1 | 0.9% |
| 1157946.95 | 1 | 0.9% |
| 65711 | 1 | 0.9% |
| 442640.29 | 1 | 0.9% |
| 226219.18 | 1 | 0.9% |
| 306535.33 | 1 | 0.9% |
| 143573.39 | 1 | 0.9% |
| 3895.86 | 1 | 0.9% |
| 56729.23 | 1 | 0.9% |
| Other values (96) | 96 | 90.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 527.54 | 1 | 0.9% |
| 2460.22 | 1 | 0.9% |
| 3895.86 | 1 | 0.9% |
| 3969.77 | 1 | 0.9% |
| 4416.06 | 1 | 0.9% |
| 4871.79 | 1 | 0.9% |
| 5238.06 | 1 | 0.9% |
| 7380.74 | 1 | 0.9% |
| 7617.33 | 1 | 0.9% |
| 8531.67 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1157946.95 | 1 | 0.9% |
| 1068387.92 | 1 | 0.9% |
| 976441.28 | 1 | 0.9% |
| 955755.57 | 1 | 0.9% |
| 934784.64 | 1 | 0.9% |
| 849950.23 | 1 | 0.9% |
| 804184.27 | 1 | 0.9% |
| 754029.94 | 1 | 0.9% |
| 677634.54 | 1 | 0.9% |
| 655129.7 | 1 | 0.9% |
healthcare_coverage
Real number (ℝ)
High correlation  Zeros 
| Distinct | 100 |
|---|---|
| Distinct (%) | 94.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 298700.27 |
| Minimum | 0 |
| Maximum | 1441488.7 |
| Zeros | 7 |
| Zeros (%) | 6.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 13139.33 |
| median | 187893.31 |
| Q3 | 439327.89 |
| 95-th percentile | 987837.51 |
| Maximum | 1441488.7 |
| Range | 1441488.7 |
| Interquartile range (IQR) | 426188.56 |
Descriptive statistics
| Standard deviation | 342822.6 |
|---|---|
| Coefficient of variation (CV) | 1.1477144 |
| Kurtosis | 1.075759 |
| Mean | 298700.27 |
| Median Absolute Deviation (MAD) | 180653.3 |
| Skewness | 1.3171977 |
| Sum | 31662228 |
| Variance | 1.1752733 × 10¹¹ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7 | 6.6% |
| 185658.8 | 1 | 0.9% |
| 9394.81 | 1 | 0.9% |
| 12883.79 | 1 | 0.9% |
| 3361.88 | 1 | 0.9% |
| 119973.56 | 1 | 0.9% |
| 26025.55 | 1 | 0.9% |
| 230741.1 | 1 | 0.9% |
| 190525.37 | 1 | 0.9% |
| 77976.38 | 1 | 0.9% |
| Other values (90) | 90 | 84.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7 | 6.6% |
| 539.02 | 1 | 0.9% |
| 640.19 | 1 | 0.9% |
| 693.39 | 1 | 0.9% |
| 1075.06 | 1 | 0.9% |
| 1941.56 | 1 | 0.9% |
| 3361.88 | 1 | 0.9% |
| 3414.68 | 1 | 0.9% |
| 4193.28 | 1 | 0.9% |
| 4560.98 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1441488.68 | 1 | 0.9% |
| 1280069.64 | 1 | 0.9% |
| 1267791.2 | 1 | 0.9% |
| 1106488.24 | 1 | 0.9% |
| 1015294.53 | 1 | 0.9% |
| 1009957.09 | 1 | 0.9% |
| 921478.76 | 1 | 0.9% |
| 900011.77 | 1 | 0.9% |
| 884243.46 | 1 | 0.9% |
| 878644.32 | 1 | 0.9% |
income
Real number (ℝ)
High correlation 
| Distinct | 100 |
|---|---|
| Distinct (%) | 94.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 106080.06 |
| Minimum | 7361 |
| Maximum | 816851 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 980.0 B |
Quantile statistics
| Minimum | 7361 |
|---|---|
| 5-th percentile | 10271.75 |
| Q1 | 35910.25 |
| median | 76761.5 |
| Q3 | 117661 |
| 95-th percentile | 198502 |
| Maximum | 816851 |
| Range | 809490 |
| Interquartile range (IQR) | 81750.75 |
Descriptive statistics
| Standard deviation | 139939.05 |
|---|---|
| Coefficient of variation (CV) | 1.3191834 |
| Kurtosis | 14.847463 |
| Mean | 106080.06 |
| Median Absolute Deviation (MAD) | 41088.5 |
| Skewness | 3.7350062 |
| Sum | 11244486 |
| Variance | 1.9582938 × 10¹⁰ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 90297 | 2 | 1.9% |
| 95344 | 2 | 1.9% |
| 7361 | 2 | 1.9% |
| 58212 | 2 | 1.9% |
| 92537 | 2 | 1.9% |
| 49737 | 2 | 1.9% |
| 163299 | 1 | 0.9% |
| 64743 | 1 | 0.9% |
| 550030 | 1 | 0.9% |
| 83325 | 1 | 0.9% |
| Other values (90) | 90 | 84.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7361 | 2 | 1.9% |
| 7873 | 1 | 0.9% |
| 8615 | 1 | 0.9% |
| 8752 | 1 | 0.9% |
| 10135 | 1 | 0.9% |
| 10682 | 1 | 0.9% |
| 12128 | 1 | 0.9% |
| 16969 | 1 | 0.9% |
| 17382 | 1 | 0.9% |
| 18258 | 1 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 816851 | 1 | 0.9% |
| 762068 | 1 | 0.9% |
| 742063 | 1 | 0.9% |
| 550030 | 1 | 0.9% |
| 545255 | 1 | 0.9% |
| 198522 | 1 | 0.9% |
| 198442 | 1 | 0.9% |
| 189277 | 1 | 0.9% |
| 188023 | 1 | 0.9% |
| 179090 | 1 | 0.9% |
income_category
Categorical
High correlation 
| Distinct | 3 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 980.0 B |
Length
| Max length | 13 |
|---|---|
| Median length | 11 |
| Mean length | 11.216981 |
| Min length | 10 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | high-income |
|---|---|
| 2nd row | medium-income |
| 3rd row | high-income |
| 4th row | low-income |
| 5th row | medium-income |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| high-income | 45 | 42.5% |
| low-income | 33 | 31.1% |
| medium-income | 28 | 26.4% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| high-income | 45 | 42.5% |
| low-income | 33 | 31.1% |
| medium-income | 28 | 26.4% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| i | 179 | 15.1% |
| m | 162 | 13.6% |
| o | 139 | 11.7% |
| e | 134 | 11.3% |
| n | 106 | 8.9% |
| - | 106 | 8.9% |
| c | 106 | 8.9% |
| h | 90 | 7.6% |
| g | 45 | 3.8% |
| l | 33 | 2.8% |
| Other values (3) | 89 | 7.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1189 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1189 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1189 | 100.0% |
Interactions
Correlations
| | county | ethnicity | fips | gender | healthcare_coverage | healthcare_expenses | income | income_category | lat | lon | maiden | marital | prefix | race | zip |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| county | 1.000 | 0.287 | 0.778 | 0.000 | 0.184 | 0.256 | 0.000 | 0.196 | 0.514 | 0.668 | 0.000 | 0.093 | 0.000 | 0.234 | 0.609 |
| ethnicity | 0.287 | 1.000 | 0.336 | 0.000 | 0.037 | 0.282 | 0.000 | 0.000 | 0.000 | 0.314 | 0.195 | 0.000 | 0.227 | 0.000 | 0.281 |
| fips | 0.778 | 0.336 | 1.000 | 0.237 | 0.000 | 0.000 | 0.108 | 0.248 | 0.323 | 0.584 | 0.000 | 0.013 | 0.000 | 0.000 | 0.798 |
| gender | 0.000 | 0.000 | 0.237 | 1.000 | 0.261 | 0.000 | 0.089 | 0.000 | 0.279 | 0.136 | 0.140 | 0.000 | 0.854 | 0.000 | 0.000 |
| healthcare_coverage | 0.184 | 0.037 | 0.000 | 0.261 | 1.000 | 0.490 | -0.070 | 0.196 | -0.195 | 0.199 | 0.658 | 0.387 | 0.400 | 0.158 | -0.063 |
| healthcare_expenses | 0.256 | 0.282 | 0.000 | 0.000 | 0.490 | 1.000 | 0.242 | 0.000 | -0.140 | 0.121 | 0.623 | 0.390 | 0.280 | 0.331 | -0.032 |
| income | 0.000 | 0.000 | 0.108 | 0.089 | -0.070 | 0.242 | 1.000 | 0.686 | -0.032 | 0.027 | 0.000 | 0.054 | 0.000 | 0.000 | -0.149 |
| income_category | 0.196 | 0.000 | 0.248 | 0.000 | 0.196 | 0.000 | 0.686 | 1.000 | 0.137 | 0.089 | 0.000 | 0.146 | 0.000 | 0.000 | 0.179 |
| lat | 0.514 | 0.000 | 0.323 | 0.279 | -0.195 | -0.140 | -0.032 | 0.137 | 1.000 | -0.260 | 0.409 | 0.098 | 0.118 | 0.324 | 0.073 |
| lon | 0.668 | 0.314 | 0.584 | 0.136 | 0.199 | 0.121 | 0.027 | 0.089 | -0.260 | 1.000 | 0.000 | 0.000 | 0.012 | 0.224 | 0.130 |
| maiden | 0.000 | 0.195 | 0.000 | 0.140 | 0.658 | 0.623 | 0.000 | 0.000 | 0.409 | 0.000 | 1.000 | 0.372 | 0.262 | 0.189 | 0.000 |
| marital | 0.093 | 0.000 | 0.013 | 0.000 | 0.387 | 0.390 | 0.054 | 0.146 | 0.098 | 0.000 | 0.372 | 1.000 | 0.565 | 0.000 | 0.000 |
| prefix | 0.000 | 0.227 | 0.000 | 0.854 | 0.400 | 0.280 | 0.000 | 0.000 | 0.118 | 0.012 | 0.262 | 0.565 | 1.000 | 0.000 | 0.000 |
| race | 0.234 | 0.000 | 0.000 | 0.000 | 0.158 | 0.331 | 0.000 | 0.000 | 0.324 | 0.224 | 0.189 | 0.000 | 0.000 | 1.000 | 0.151 |
| zip | 0.609 | 0.281 | 0.798 | 0.000 | -0.063 | -0.032 | -0.149 | 0.179 | 0.073 | 0.130 | 0.000 | 0.000 | 0.000 | 0.151 | 1.000 |
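
To read the matrix programmatically, one option is to list the variable pairs whose coefficient exceeds a cutoff. The sketch below uses a handful of entries copied from the table and a threshold of 0.5. That threshold is an assumption, chosen because it is consistent with the pairs the report's alerts flag as highly correlated; it is not a value stated in this section. The function name `high_pairs` is illustrative.

```python
# A few off-diagonal entries copied from the correlation table above.
correlations = {
    ("county", "fips"): 0.778,
    ("county", "lon"): 0.668,
    ("county", "zip"): 0.609,
    ("county", "lat"): 0.514,
    ("fips", "zip"): 0.798,
    ("fips", "lon"): 0.584,
    ("gender", "prefix"): 0.854,
    ("income", "income_category"): 0.686,
    ("healthcare_coverage", "maiden"): 0.658,
    ("healthcare_expenses", "maiden"): 0.623,
    ("marital", "prefix"): 0.565,
    ("healthcare_coverage", "healthcare_expenses"): 0.490,
    ("lat", "lon"): -0.260,
}

THRESHOLD = 0.5  # assumed cutoff, not taken from this section


def high_pairs(matrix, threshold):
    """Return (pair, value) tuples with |value| >= threshold, strongest first."""
    flagged = [
        (pair, value) for pair, value in matrix.items() if abs(value) >= threshold
    ]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


for (left, right), value in high_pairs(correlations, THRESHOLD):
    print(f"{left} ~ {right}: {value:.3f}")
```

Under this cutoff, `healthcare_coverage` and `healthcare_expenses` (0.490) fall just below the line, while `gender` and `prefix` (0.854) head the list.
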
Missing values
Sample
First rows
| | id | birthdate | deathdate | ssn | drivers | passport | prefix | firstname | middlename | lastname | suffix | maiden | marital | race | ethnicity | gender | birthplace | address | city | state | county | fips | zip | lat | lon | healthcare_expenses | healthcare_coverage | income | income_category |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30a6452c-4297-a1ac-977a-6a23237c7b46 | 1994-02-06 | NaN | 999-52-8591 | S99996852 | X47758697X | Mr. | Joshua658 | Alvin56 | Kunde533 | Unknown | Unknown | M | white | nonhispanic | M | Boston Massachusetts US | 811 Kihn Viaduct | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.211142 | -71.045802 | 56904.96 | 18019.99 | 100511 | high-income |
| 1 | 34a4dcc4-35fb-6ad5-ab98-be285c586a4f | 1968-08-06 | 2009-12-11 | 999-75-3953 | S99993577 | X28173268X | Mr. | Bennie663 | Unknown | Ebert178 | Unknown | Unknown | D | white | nonhispanic | M | Chicopee Massachusetts US | 975 Pfannerstill Throughway | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.255420 | -70.971016 | 124024.12 | 1075.06 | 49737 | medium-income |
| 2 | 7179458e-d6e3-c723-2530-d4acfe1c2668 | 2008-12-21 | NaN | 999-70-1925 | Unknown | Unknown | Unknown | Hunter736 | Mckinley734 | Gerlach374 | Unknown | Unknown | Unknown | white | nonhispanic | M | Spencer Massachusetts US | 548 Heller Lane | Mattapoisett | Massachusetts | Plymouth County | Unknown | 0 | 41.648292 | -70.850619 | 45645.06 | 6154.94 | 133816 | high-income |
| 3 | 37c177ea-4398-fb7a-29fa-70eb3d673876 | 1994-01-27 | NaN | 999-27-9779 | S99995100 | X83694889X | Mrs. | Carlyn477 | Florencia449 | Williamson769 | Unknown | Rogahn59 | M | asian | nonhispanic | F | Franklin Massachusetts US | 160 Fadel Crossroad Apt 65 | Wareham | Massachusetts | Plymouth County | Unknown | 0 | 41.789096 | -70.711616 | 12895.15 | 659951.61 | 17382 | low-income |
| 4 | 0fef2411-21f0-a269-82fb-c42b55471405 | 2019-07-27 | NaN | 999-50-8977 | Unknown | Unknown | Unknown | Robin66 | Jeramy610 | Gleichner915 | Unknown | Unknown | Unknown | white | nonhispanic | M | Brockton Massachusetts US | 766 Grant Loaf Unit 15 | Groveland | Massachusetts | Essex County | Unknown | 0 | 42.734183 | -70.976410 | 18500.02 | 5493.57 | 52159 | medium-income |
| 5 | ec1a6cad-8825-7b5c-4e14-257c696d5f11 | 2019-04-18 | NaN | 999-13-4533 | Unknown | Unknown | Unknown | Arthur650 | Unknown | Roberts511 | Unknown | Unknown | Unknown | white | nonhispanic | M | Plymouth Massachusetts US | 866 Kulas Harbor | Cambridge | Massachusetts | Middlesex County | 25017.0 | 2138 | 42.377781 | -71.044112 | 14478.23 | 693.39 | 75767 | medium-income |
| 6 | 4569671e-ed39-055f-8e78-422b96c9896b | 2013-08-10 | NaN | 999-40-7708 | Unknown | Unknown | Unknown | Caryl47 | Lelia627 | Kassulke119 | Unknown | Unknown | Unknown | white | nonhispanic | F | East Falmouth Massachusetts US | 578 Dickens Camp | Arlington | Massachusetts | Middlesex County | 25017.0 | 2476 | 42.412276 | -71.202859 | 9821.14 | 27142.51 | 58294 | medium-income |
| 7 | c1acd7ba-dacf-36d2-6010-db8934400000 | 1968-08-06 | NaN | 999-97-4087 | S99911538 | X37637991X | Mr. | Willian804 | Shelton25 | Keeling57 | Unknown | Unknown | M | white | nonhispanic | M | Methuen Massachusetts US | 848 Ebert Knoll Unit 7 | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.214009 | -71.004896 | 175817.63 | 55473.97 | 49737 | medium-income |
| 8 | 3648fb36-1cd1-3641-0b1c-1f00d1e7e7de | 2006-07-02 | NaN | 999-78-1635 | S99943171 | Unknown | Ms. | Domenica436 | Unknown | Rau926 | Unknown | Unknown | Unknown | white | hispanic | F | Maynard Massachusetts US | 963 Senger Fort | Haverhill | Massachusetts | Essex County | 25009.0 | 1835 | 42.816043 | -71.051503 | 52933.16 | 11941.44 | 77756 | medium-income |
| 9 | 50ca7edb-0dee-35e6-5d8f-66fbcb0b37c1 | 1948-05-28 | NaN | 999-27-5104 | S99941458 | X59458953X | Mr. | Arnulfo253 | Jordan900 | Jaskolski867 | Unknown | Unknown | D | white | nonhispanic | M | Boston Massachusetts US | 757 Lockman Annex Apt 10 | Georgetown | Massachusetts | Essex County | Unknown | 0 | 42.695602 | -70.972510 | 242013.44 | 322768.97 | 35255 | low-income |
Last rows
| | id | birthdate | deathdate | ssn | drivers | passport | prefix | firstname | middlename | lastname | suffix | maiden | marital | race | ethnicity | gender | birthplace | address | city | state | county | fips | zip | lat | lon | healthcare_expenses | healthcare_coverage | income | income_category |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 96 | 98cbb02b-c16a-60e4-1ff0-37c0e45e0e9f | 2011-09-30 | NaN | 999-74-6516 | Unknown | Unknown | Unknown | Moses679 | Unknown | Friesen796 | Unknown | Unknown | Unknown | white | nonhispanic | M | Boston Massachusetts US | 689 Bailey Plaza Apt 88 | Brockton | Massachusetts | Plymouth County | 25023.0 | 2351 | 42.046132 | -71.001359 | 3895.86 | 285495.20 | 8615 | low-income |
| 97 | d6cc7569-5f31-9648-ec6a-e1162b32b183 | 2008-06-07 | NaN | 999-59-4941 | S99999666 | Unknown | Unknown | Diamond340 | Mirtha993 | Keebler762 | Unknown | Unknown | Unknown | white | nonhispanic | F | North Attleborough Massachusetts US | 905 Smitham Bay | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.237635 | -71.018562 | 56729.23 | 13905.95 | 94205 | high-income |
| 98 | 780fe740-20fb-07ee-1fbd-3fafa9f5df91 | 2009-08-20 | NaN | 999-71-1449 | Unknown | Unknown | Unknown | Stanton715 | Dion244 | Kassulke119 | Unknown | Unknown | Unknown | white | nonhispanic | M | Taunton Massachusetts US | 539 Grady Fork Suite 43 | Leominster | Massachusetts | Worcester County | 25027.0 | 1453 | 42.583332 | -71.817754 | 3969.77 | 55724.98 | 24218 | low-income |
| 99 | cca2c7f0-a2aa-94e5-ccea-cb78a7d38652 | 1972-01-25 | NaN | 999-36-7955 | S99988067 | X10446987X | Mrs. | Margarette462 | Britt177 | West559 | Unknown | Heidenreich818 | D | white | nonhispanic | F | Chelmsford Massachusetts US | 756 Schaefer Row Apt 84 | Yarmouth | Massachusetts | Barnstable County | Unknown | 0 | 41.666870 | -70.218846 | 119874.42 | 921478.76 | 124775 | high-income |
| 100 | 3c7e37b0-c610-bc9a-d75a-f782e5dc7598 | 2023-01-18 | NaN | 999-74-5035 | Unknown | Unknown | Unknown | Lael572 | Anitra287 | Schuppe920 | Unknown | Unknown | Unknown | white | nonhispanic | F | Boston Massachusetts US | 752 Simonis Gate Suite 16 | Holyoke | Massachusetts | Hampden County | 25013.0 | 1040 | 42.238973 | -72.613961 | 4871.79 | 0.00 | 545255 | high-income |
| 101 | 37713015-cfb5-bf1a-70eb-970101f32341 | 2018-04-09 | NaN | 999-80-8977 | Unknown | Unknown | Unknown | Yun266 | Norah104 | Ernser583 | Unknown | Unknown | Unknown | white | nonhispanic | F | Holliston Massachusetts US | 376 Ullrich Knoll Unit 86 | Fairhaven | Massachusetts | Bristol County | Unknown | 0 | 41.668507 | -70.897356 | 15979.43 | 4193.28 | 35486 | low-income |
| 102 | d426334c-a982-3a31-7e0f-ca3c7fe01310 | 1960-05-07 | NaN | 999-80-9251 | S99966941 | X9157439X | Mrs. | Anita473 | Berta524 | Sánchez310 | Unknown | Rodarte647 | W | white | hispanic | F | Santiago de los Caballeros Santiago DO | 977 White Row | Beverly | Massachusetts | Essex County | 25009.0 | 1915 | 42.520925 | -70.873600 | 955755.57 | 1280069.64 | 61016 | medium-income |
| 103 | cb1b46a1-9cb5-1187-ccc5-9fb7b98aa957 | 1982-12-09 | NaN | 999-83-1974 | S99951357 | X10229924X | Mr. | Grady603 | Delmar187 | Swaniawski813 | Unknown | Unknown | M | white | nonhispanic | M | Springfield Massachusetts US | 623 Crooks Street | Sharon | Massachusetts | Norfolk County | 25021.0 | 2067 | 42.143164 | -71.170529 | 302685.62 | 87202.01 | 63727 | medium-income |
| 104 | d1622e8b-d26b-ec81-ffcb-ec4bf2af385b | 1951-11-22 | 2017-08-18 | 999-55-3884 | S99996090 | X31384759X | Mrs. | Elna874 | Dian810 | Prohaska837 | Unknown | Bogisich202 | D | asian | nonhispanic | F | Fitchburg Massachusetts US | 574 Stanton Stravenue | Boston | Massachusetts | Suffolk County | 25025.0 | 2129 | 42.322290 | -71.025025 | 100734.69 | 1441488.68 | 92537 | high-income |
| 105 | f339a5f7-0b09-3072-2b01-7c8e8ca2c1fc | 1951-11-22 | NaN | 999-66-2146 | S99975537 | X26025438X | Ms. | Blanca837 | Allyn942 | Reinger292 | Unknown | Unknown | S | asian | nonhispanic | F | Millis-Clicquot Massachusetts US | 698 Hagenes Annex | Boston | Massachusetts | Suffolk County | 25025.0 | 2116 | 42.421764 | -71.004807 | 655129.70 | 381212.81 | 92537 | high-income |